This Jupyter notebook describes the steps needed to download and plot stock data using Python. A standout Python package for this is pandas-datareader, which provides a simple interface for getting data from Google Finance, Yahoo! Finance, the World Bank, etc.
To start using pandas-datareader (assuming it is already installed on our computer), we first need to import the library. As it has quite a long name, we will import it as web (a shorter name).
In [1]:
import pandas_datareader.data as web
The DataReader function from the library imported above provides data on the stocks available in Google/Yahoo! Finance. The function takes two mandatory arguments: the name (ticker symbol) of the stock, which is a text string and so should be in quotes, and the name of the website, which is again a string and so again goes in quotes.
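Let's get the IBM stock data from Google Finance. (Which data sources are available depends on the installed pandas-datareader version; later releases deprecated the "google" source.)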
In [2]:
data = web.DataReader("IBM","google")
The IBM stock data is now downloaded and saved in our variable called data. To view the first 5 observations/rows of the data, the head() function can be used:
In [3]:
data.head()
Out[3]:
Please note that the head() function above gives the first 5 observations by default (whenever we do not put anything else inside the brackets). If one is interested in viewing the first 10 observations, that's also doable. The only necessary step is to give 10 as an argument to our head() function:
In [4]:
data.head(10)
Out[4]:
Similarly, one can view the last 5 or 10 observations by using the tail() function instead of head():
In [5]:
data.tail()
Out[5]:
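The head() and tail() calls above can be tried without downloading anything. The following is a minimal sketch, assuming a small made-up DataFrame (the name sample and its prices are hypothetical, not real IBM data) in place of the downloaded one:

```python
import pandas as pd

# Hypothetical stand-in for the downloaded data: 30 business days of made-up prices.
dates = pd.date_range("2015-05-01", periods=30, freq="B")
sample = pd.DataFrame(
    {"Open": range(100, 130), "High": range(101, 131)},
    index=dates,
)

default_rows = sample.head()   # no argument: the first 5 rows
first_three = sample.head(3)   # the first 3 rows
last_three = sample.tail(3)    # the last 3 rows

print(first_three)
print(last_three)
```

Both functions return a new, smaller DataFrame, so the result can be stored in a variable and used further.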
The type of the resulting dataset (i.e. the type of the variable called data) is known as a DataFrame, as it represents the data inside a frame. We can also confirm that by using the type() function.
In [6]:
type(data)
Out[6]:
DataFrames are a very user-friendly type to work with. Many operations on DataFrames are similar to those we did with lists. For example, to choose only one column of the DataFrame, one just needs to put the name of the chosen column inside square brackets (note that the name is a string, so it should be inside quotes):
In [7]:
data["Open"]
Out[7]:
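As a small illustration of column selection, the sketch below uses a made-up DataFrame (the name sample and its values are hypothetical). It shows that a single column name gives back a pandas Series, while a list of names gives back a smaller DataFrame:

```python
import pandas as pd

# Hypothetical three-day price table used only for illustration.
sample = pd.DataFrame(
    {
        "Open": [10.0, 11.0, 12.0],
        "High": [10.5, 11.5, 12.5],
        "Low": [9.5, 10.5, 11.5],
    },
    index=pd.date_range("2015-05-01", periods=3, freq="D"),
)

one_column = sample["Open"]              # a single name returns a Series
two_columns = sample[["Open", "High"]]   # a list of names returns a DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
```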
Similarly, one can choose to show only selected rows from the DataFrame by passing a date string to .loc (a year such as "2015" or a year and month such as "2015-05" selects every row in that period):
In [8]:
data.loc["2015-05"]
Out[8]:
A specific date range is also acceptable as an input:
In [9]:
data["2015-05":"2016-05"]
Out[9]:
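Date-based row selection can be sketched on made-up data as follows. The DataFrame sample is hypothetical (one row per calendar day of 2015 and 2016), and .loc is used here as the explicit label-based selector:

```python
import pandas as pd

# Hypothetical daily data covering all of 2015 and 2016.
dates = pd.date_range("2015-01-01", "2016-12-31", freq="D")
sample = pd.DataFrame({"Open": range(len(dates))}, index=dates)

may_2015 = sample.loc["2015-05"]            # every row in May 2015
year_2015 = sample.loc["2015"]              # every row in 2015
span = sample.loc["2015-05":"2016-05"]      # May 2015 through the end of May 2016

print(len(may_2015), len(year_2015))
print(span.index[0], span.index[-1])
```

Note that when a range is given with year-month strings, the end month is included in full.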
Now let's move forward and plot the data we received. For plotting purposes, the matplotlib.pyplot library is usually used in Python. Let's import it first. As it has quite a long name, we will call it plt inside our Jupyter notebook.
In [10]:
import matplotlib.pyplot as plt
Let's first make a sample plot, and then move to our dataset. To make a plot and show it, one needs two functions from plt: plt.plot() and plt.show().
In [11]:
# a sample plot of bisector line
plt.plot([1,2,3,4],[1,2,3,4])
plt.show()
Please note that without the plt.show() function the plot would be generated but not shown. This is sometimes useful when you want to generate a plot and save it instead of showing it. However, if one always wants to see the plotted graphs, one could add the following line in the cell that imports the matplotlib.pyplot library:
%matplotlib inline
This magic command tells the Jupyter notebook to show all the generated plots inline (inside the notebook). With it in place, there is no need to type plt.show() every single time.
We did not use it here, so we have to call plt.show() for each plot. Let's now plot the highest price of IBM stock.
In [12]:
plt.plot(data["High"])
plt.show()
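As mentioned earlier, a plot can also be saved to a file instead of being shown. The following is a minimal sketch of that, using the sample bisector line rather than the stock data. The file name bisector.png and the temporary directory are arbitrary choices for illustration, and the non-interactive Agg backend is selected only so that the sketch runs without a display:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: draws to a file, opens no window
import matplotlib.pyplot as plt

# Hypothetical output location chosen for this example.
out_path = os.path.join(tempfile.gettempdir(), "bisector.png")

plt.plot([1, 2, 3, 4], [1, 2, 3, 4])
plt.savefig(out_path)  # write the current figure to disk instead of showing it
plt.close()            # release the figure once it has been saved

print(out_path)
```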
Let's also get the Apple stock data, and plot its daily highest price together with IBM's (to compare).
In [13]:
data_apple = web.DataReader("AAPL",'google')
Now we also have the Apple stock data. To make two plots on the same graph, one just needs two plt.plot() calls followed by a single plt.show() call at the end:
In [14]:
plt.plot(data["High"])
plt.plot(data_apple["High"])
plt.show()
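When two lines share one graph, it helps to say which is which. The sketch below uses made-up numbers (ibm_high and apple_high are hypothetical Series, not downloaded prices) and adds a label to each line plus a legend. The Agg backend is chosen only so the sketch runs without a display; inside the notebook one would finish with plt.show() as before:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, used here only for illustration
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical daily highs for two stocks over five days.
dates = pd.date_range("2015-05-01", periods=5, freq="D")
ibm_high = pd.Series([170.0, 171.5, 169.0, 172.0, 173.5], index=dates)
apple_high = pd.Series([125.0, 126.0, 128.5, 127.0, 129.0], index=dates)

plt.plot(ibm_high, label="IBM")      # label is the text shown in the legend
plt.plot(apple_high, label="AAPL")
plt.ylabel("Daily high")
legend = plt.legend()                # draws a legend box from the labels above

labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```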
If you are interested in customizing your plot (colors, appearance, etc.), you may check the official tutorial, which covers many of the features available for the plt.plot() function (and more).